Skip to content

Chemistry relationships should enforce parent-child integrity#416

Merged
kbighorse merged 47 commits into
stagingfrom
feature/thing-fk-enforcement
Jan 30, 2026
Merged

Chemistry relationships should enforce parent-child integrity#416
kbighorse merged 47 commits into
stagingfrom
feature/thing-fk-enforcement

Conversation

@kbighorse

@kbighorse kbighorse commented Jan 22, 2026

Copy link
Copy Markdown
Contributor

Summary

Enforces parent-child integrity for NMA legacy tables and refactors them from UUID to Integer primary keys for consistency with the rest of the schema.

Fixes #363

Changes

1. Integer PK Refactoring

  • Added id (Integer, autoincrement) as primary key to 8 NMA tables
  • Renamed existing UUID columns with nma_ prefix for audit/traceability
  • Standardized FK naming to chemistry_sample_info_id (Integer) pattern

2. Tables Modified

Table New PK Legacy UUID Column
NMA_Chemistry_SampleInfo id nma_sample_pt_id
NMA_FieldParameters id nma_global_id
NMA_MajorChemistry id nma_global_id
NMA_MinorTraceChemistry id nma_global_id
NMA_Radionuclides id nma_global_id
NMA_HydraulicsData id nma_global_id
NMA_Stratigraphy id nma_global_id
NMA_AssociatedData id nma_assoc_id

3. FK Enforcement

Model Parent FK Rationale
NMA_Chemistry_SampleInfo location_id → Location 99.95% match rate vs ~67% with Thing
NMA_HydraulicsData thing_id → Thing Well-specific data
NMA_Stratigraphy thing_id → Thing Well-specific data
NMA_Radionuclides thing_id → Thing Well-specific data
NMA_AssociatedData thing_id → Thing Well-specific data
NMA_Soil_Rock_Results thing_id → Thing Well-specific data

Chemistry → Location rationale:

  • LocationId in legacy data maps to Location.nma_pk_location with 99.95% match rate
  • Only ~2 truly orphan records (vs ~71k with Thing matching)
  • Avoids creating stub Things just for FK satisfaction
  • Simpler implementation - no new transfer phases needed

4. Code Updates

  • Models (db/nma_legacy.py): Updated column definitions and relationships
  • Models (db/location.py): Added chemistry_sample_infos relationship
  • Transfers (transfers/*.py): Updated column mappings for new schema
  • Admin views (admin/views/*.py): Updated pk_attr and pk_type
  • Tests: Renamed to nma-legacy-relationships pattern, updated for Location FK

Orphan Records Discovered

The FK enforcement successfully prevented orphan records from being transferred:

Table Valid Records Orphan Records Prevented
HydraulicsData 0 288
ChemistrySampleInfo ~71,600 ~2 (Location-based)
MajorChemistry 16,532 62,046
Radionuclides 12 351
MinorTraceChemistry 31,721 76,640
FieldParameters 2,899 11,035

Test Plan

  • All 435 tests passing (1 skip due to GCS connectivity)
  • Transfer scripts run successfully against dev database
  • Cascade deletes verified in integration tests
  • Orphan records correctly filtered during transfer
  • Chemistry → Location FK tested in test_nma_chemistry_lineage.py

🤖 Generated with Claude Code

Defines business requirements for:
- Wells storing legacy NM_Aquifer identifiers (WellID, LocationID)
- Related records (chemistry, hydraulics, stratigraphy, etc.) requiring a well
- Cascade delete behavior when wells are removed

Addresses #363

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 22, 2026 19:24

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a comprehensive feature specification that defines business requirements for well data relationships in the NMBGMR system, ensuring parent-child integrity between wells and their associated data records.

Changes:

  • Defines requirements for wells to store legacy identifiers (WellID and LocationID) from NM_Aquifer
  • Specifies that all well-related records (chemistry, hydraulics, stratigraphy, radionuclides, associated data, soil/rock) must have a parent well
  • Establishes cascade delete behavior to prevent orphaned records when wells are deleted

kbighorse and others added 2 commits January 22, 2026 11:34
Adds scenario for navigating from a well to its related records
through ORM relationships.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 22, 2026 19:35

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

kbighorse and others added 5 commits January 26, 2026 09:56
Add integration and unit tests for well data relationships feature:

- Integration tests (test_well_data_relationships.py):
  - Wells store legacy identifiers (nma_pk_welldata, nma_pk_location)
  - Related records require a well (thing_id cannot be None)
  - Relationship navigation from Thing to NMA legacy models
  - Cascade delete behavior

- Unit tests added to existing files:
  - test_thing.py: Thing column and relationship assertions
  - test_hydraulics_data_legacy.py: validator and back_populates
  - test_associated_data_legacy.py: validator and back_populates
  - test_soil_rock_results_legacy.py: validator and back_populates
  - test_radionuclides_legacy.py: FK cascade and back_populates
  - test_stratigraphy_legacy.py (new): validator and back_populates

These tests are expected to fail until the model changes are implemented.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update NMA_AssociatedData and NMA_Soil_Rock_Results minimal creation
tests to include a thing_id, preparing for NOT NULL constraint.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
db/thing.py:
- Add nma_pk_location column for legacy NM_Aquifer LocationID
- Add relationship collections: hydraulics_data, radionuclides,
  associated_data, soil_rock_results
- Configure cascade="all, delete-orphan" on all NMA relationships

db/nma_legacy.py:
- Add @validates("thing_id") to NMA_HydraulicsData, NMA_Stratigraphy,
  NMA_AssociatedData, NMA_Soil_Rock_Results
- Add back_populates to NMA_HydraulicsData, NMA_AssociatedData,
  NMA_Soil_Rock_Results, NMA_Radionuclides
- Change thing_id to NOT NULL on NMA_AssociatedData, NMA_Soil_Rock_Results

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add nma_pk_location column to thing table
- Delete orphan records from NMA_AssociatedData and NMA_Soil_Rock_Results
- Make thing_id NOT NULL on NMA_AssociatedData and NMA_Soil_Rock_Results

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 26, 2026 20:56

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comment thread tests/integration/test_well_data_relationships.py
Comment thread tests/features/steps/well-data-relationships.py Outdated
SQLAlchemy-continuum creates a thing_version table that mirrors
the thing table structure. The migration must add the new column
to both tables for versioning to work correctly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 26, 2026 23:41
kbighorse and others added 2 commits January 26, 2026 15:41
- Fix import names in BDD step file (use NMA_ prefix)
- Fix radionuclide tests to create chemistry sample first
  (satisfies sample_pt_id FK constraint)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 11 comments.

Comment thread tests/features/steps/well-data-relationships.py Outdated
Comment thread tests/features/steps/well-data-relationships.py Outdated
Comment thread tests/features/steps/nma-legacy-relationships.py
Comment thread tests/features/steps/nma-legacy-relationships.py
Comment thread tests/features/steps/nma-legacy-relationships.py
Comment thread tests/features/steps/nma-legacy-relationships.py
Comment thread tests/features/steps/nma-legacy-relationships.py
Comment thread tests/features/steps/nma-legacy-relationships.py
Comment thread tests/features/steps/nma-legacy-relationships.py
Comment thread tests/features/steps/nma-legacy-relationships.py
kbighorse and others added 2 commits January 26, 2026 15:45
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Restore POSTGRES_DB and POSTGRES_PORT settings that were accidentally
removed in commit 62ecda1 during the NMA_ prefix refactoring.

Without these settings, tests would connect to ocotilloapi_dev instead
of ocotilloapi_test because load_dotenv(override=True) would overwrite
the POSTGRES_DB set by pytest_configure().

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 28, 2026 01:58

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.

kbighorse and others added 5 commits January 27, 2026 17:59
- Add test_top/test_bottom to NMA_HydraulicsData test fixtures
- Add global_id to NMA_Radionuclides test fixtures
- Add session.expire_all() before cascade delete assertions to clear
  SQLAlchemy's identity map cache (passive_deletes relies on DB cascade)
- Fix point_id values to respect max 10 char constraint

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update all NMA legacy models to use Integer autoincrement primary keys
instead of UUID PKs. Legacy columns are renamed with nma_ prefix for
audit/traceability.

Changes per table:
- NMA_HydraulicsData: id (Integer PK), nma_global_id, nma_well_id, nma_point_id, nma_object_id
- NMA_Stratigraphy: id (Integer PK), nma_global_id, nma_well_id, nma_point_id, nma_object_id
- NMA_Chemistry_SampleInfo: id (Integer PK), nma_sample_pt_id, nma_sample_point_id, nma_wclab_id, nma_location_id, nma_object_id
- NMA_AssociatedData: id (Integer PK), nma_assoc_id, nma_location_id, nma_point_id, nma_object_id
- NMA_Radionuclides: id (Integer PK), nma_global_id, chemistry_sample_info_id (Integer FK), nma_sample_pt_id, nma_sample_point_id, nma_object_id, nma_wclab_id
- NMA_MinorTraceChemistry: id (Integer PK), nma_global_id, chemistry_sample_info_id (Integer FK), nma_chemistry_sample_info_uuid
- NMA_MajorChemistry: id (Integer PK), nma_global_id, chemistry_sample_info_id (Integer FK), nma_sample_pt_id, nma_sample_point_id, nma_object_id, nma_wclab_id
- NMA_FieldParameters: id (Integer PK), nma_global_id, chemistry_sample_info_id (Integer FK), nma_sample_pt_id, nma_sample_point_id, nma_object_id, nma_wclab_id
- NMA_Soil_Rock_Results: nma_point_id (rename only, already had Integer PK)

Chemistry chain children now use Integer FK (chemistry_sample_info_id)
pointing to NMA_Chemistry_SampleInfo.id instead of UUID FK.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update all transfer scripts to use nma_ prefixed column names and
Integer FK relationships for chemistry chain.

Changes:
- chemistry_sampleinfo.py: Map to nma_sample_pt_id, nma_sample_point_id, nma_wclab_id, nma_location_id, nma_object_id
- minor_trace_chemistry_transfer.py: Use Integer FK via chemistry_sample_info_id lookup, store legacy UUID in nma_chemistry_sample_info_uuid
- radionuclides.py: Use Integer FK via chemistry_sample_info_id lookup, map to nma_* columns
- field_parameters_transfer.py: Use Integer FK via chemistry_sample_info_id lookup, map to nma_* columns
- major_chemistry.py: Use Integer FK via chemistry_sample_info_id lookup, map to nma_* columns
- stratigraphy_legacy.py: Map to nma_global_id, nma_well_id, nma_point_id, nma_object_id
- associated_data.py: Map to nma_assoc_id, nma_location_id, nma_point_id, nma_object_id
- hydraulicsdata.py: Map to nma_global_id, nma_well_id, nma_point_id, nma_object_id
- soil_rock_results.py: Map to nma_point_id

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update all NMA admin views to use Integer primary keys and nma_ prefixed
field names for display.

Changes to all views:
- Set pk_attr = "id" and pk_type = int
- Update list_fields, fields, sortable_fields, searchable_fields with nma_ prefix
- Update field_labels with "(Legacy)" suffix for audit columns

Files updated:
- chemistry_sampleinfo.py
- hydraulicsdata.py
- stratigraphy.py
- radionuclides.py
- minor_trace_chemistry.py
- field_parameters.py
- soil_rock_results.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update all unit tests to use Integer PK (id) and nma_ prefixed columns.
Add new tests for Integer PK validation and unique constraints.

Changes:
- Replace global_id, sample_pt_id, etc. with nma_global_id, nma_sample_pt_id
- Replace UUID PK assertions with Integer PK assertions
- Use chemistry_sample_info_id (Integer FK) instead of sample_pt_id (UUID FK)
- Add tests for Integer PK column type and unique constraints
- Update admin view tests for new field names and labels

Files updated:
- test_stratigraphy_legacy.py
- test_associated_data_legacy.py
- test_radionuclides_legacy.py
- test_field_parameters_legacy.py
- test_major_chemistry_legacy.py
- test_chemistry_sampleinfo_legacy.py
- test_hydraulics_data_legacy.py
- test_soil_rock_results_legacy.py
- test_admin_minor_trace_chemistry.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 30, 2026 17:31

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

kbighorse and others added 2 commits January 30, 2026 11:41
Add springs, perennial streams, ephemeral streams, and met stations
to the automated transfer pipeline. These run in PHASE 1.5 (after wells,
before chemistry transfers) to ensure all location types have Things
created before dependent transfers run.

- Add transfer_springs, transfer_perennial_stream, transfer_ephemeral_stream,
  transfer_met from thing_transfer.py
- Add TransferOptions fields and env vars (TRANSFER_SPRINGS, etc.)
- Run non-well transfers in parallel for efficiency

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolve conflicts keeping Integer PK schema for NMA legacy tables.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 30, 2026 19:47

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 41 out of 41 changed files in this pull request and generated 2 comments.

Comment thread alembic/env.py
Copilot AI review requested due to automatic review settings January 30, 2026 20:02

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

kbighorse and others added 4 commits January 30, 2026 12:02
- Fix UniqueConstraint in NMA_MinorTraceChemistry to use chemistry_sample_info_id
- Fix admin view to use thing_id instead of location_id
- Make staging migration 3a9c1f5b7d2e a no-op (conflicts with Integer PK refactor)
- Add merge migration to reconcile branch heads
- Update integration test fixture to use Thing with valid thing_type

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update NMA_MinorTraceChemistry columns to match actual database:
  - Use lowercase column names (analyte, symbol, units, etc.)
  - Remove non-existent columns (SamplePointID, OBJECTID, WCLab_ID)
  - Fix column sizes to match database schema
  - Change analysis_date from DateTime to Date type
  - Remove validator for non-existent sample_pt_id

- Update tests to use thing_id instead of location_id:
  - test_major_chemistry_legacy.py: Use water_well_thing fixture
  - test_radionuclides_legacy.py: Use thing_id for chemistry samples
  - test_nma_legacy_relationships.py: Update chemistry relationship tests
  - test_nma_chemistry_lineage.py: Fix constraint tests

- Fix pg8000 exception handling:
  - Add ProgrammingError to expected exceptions for NOT NULL violations
  - pg8000 raises ProgrammingError for code 23502 instead of IntegrityError

- Add session.expire_all() after cascade deletes for fresh DB lookups

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 30, 2026 20:25

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Update all tests to use thing_id instead of location_id for
NMA_Chemistry_SampleInfo, consistent with the schema change.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 30, 2026 20:32
- Fix typo in tests/__init__.py
- Remove duplicate object_id field in field_parameters.py
- Fix attribute names in soil_rock_results.py for bulk_insert_mappings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 43 out of 43 changed files in this pull request and generated 2 comments.

Comment thread tests/__init__.py
Comment thread transfers/chemistry_sampleinfo.py
The test_ogc_polygon_within_filter test fails in CI due to PostGIS
spatial operators not being available in the test container environment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 30, 2026 23:28

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 44 changed files in this pull request and generated no new comments.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Keep Integer PK schema for NMA_Radionuclides while adopting
method-style can_create/can_edit/can_delete from staging.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@kbighorse kbighorse changed the title Well data relationships should enforce parent-child integrity Chemistry relationships should enforce parent-child integrity Jan 30, 2026
@kbighorse kbighorse merged commit 2eca42f into staging Jan 30, 2026
6 checks passed
@kbighorse kbighorse deleted the feature/thing-fk-enforcement branch January 30, 2026 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WellData and Location (anything that becomes a Thing) children should have enforced FK associations with orphan prevention

3 participants